Gender and Interest Targeting for Sponsored Post Advertising at Tumblr
As one of the leading platforms for creative content, Tumblr offers
advertisers a unique way of creating brand identity. Advertisers can tell their
story through images, animation, text, music, video, and more, and promote that
content by sponsoring it to appear as an advertisement in the streams of Tumblr
users. In this paper we present a framework that enabled one of the key
targeted advertising components for Tumblr, specifically gender and interest
targeting. We describe the main challenges involved in the development of the
framework, which include creating the ground truth for training gender
prediction models, as well as mapping Tumblr content to an interest taxonomy.
For purposes of inferring user interests we propose a novel semi-supervised
neural language model for categorization of Tumblr content (i.e., post tags and
post keywords). The model was trained on a large-scale data set consisting of
6.8 billion user posts, with a very limited amount of categorized keywords, and
was shown to have superior performance over the bag-of-words model. We
successfully deployed gender and interest targeting capability in Yahoo
production systems, delivering inference for users that cover more than 90% of
daily activities at Tumblr. Online performance results indicate advantages of
the proposed approach, where we observed a 20% lift in user engagement with
sponsored posts as compared to untargeted campaigns.
Comment: 10 pages, 9 figures, Proceedings of the 21st ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (KDD 2015), Sydney,
Australia
Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising
Sponsored search represents a major source of revenue for web search engines.
This popular advertising model brings a unique possibility for advertisers to
target users' immediate intent communicated through a search query, usually by
displaying their ads alongside organic search results for queries deemed
relevant to their products or services. However, due to a large number of
unique queries it is challenging for advertisers to identify all such relevant
queries. For this reason search engines often provide a service of advanced
matching, which automatically finds additional relevant queries for advertisers
to bid on. We present a novel advanced matching approach based on the idea of
semantic embeddings of queries and ads. The embeddings were learned using a
large data set of user search sessions, consisting of search queries, clicked
ads and search links, while utilizing contextual information such as dwell time
and skipped ads. To address the large-scale nature of our problem, both in
terms of data and vocabulary size, we propose a novel distributed algorithm for
training of the embeddings. Finally, we present an approach for overcoming the
cold-start problem associated with new ads and queries. We report results of
editorial evaluation and online tests on actual search traffic. The results
show that our approach significantly outperforms baselines in terms of
relevance, coverage, and incremental revenue. Lastly, we open-source learned
query embeddings to be used by researchers in computational advertising and
related fields.
Comment: 10 pages, 4 figures, 39th International ACM SIGIR Conference on
Research and Development in Information Retrieval, SIGIR 2016, Pisa, Italy
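
The idea of matching queries to ads in a shared embedding space can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the similarity measure or how candidates are selected, so cosine similarity, the threshold, the top_k cutoff, the function names, and the toy three-dimensional vectors are all assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat zero vectors as unrelated to everything
    return dot / (norm_u * norm_v)


def match_queries_to_ad(ad_vec, query_vecs, threshold=0.8, top_k=3):
    """Return up to top_k (query, score) pairs scoring at least threshold.

    threshold and top_k are hypothetical knobs, not values from the paper.
    """
    scored = [(query, cosine(ad_vec, vec)) for query, vec in query_vecs.items()]
    kept = [(query, score) for query, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Toy, hand-written embeddings; real ones would be learned from search sessions.
QUERY_VECS = {
    "cheap flights": [0.9, 0.1, 0.0],
    "airline tickets": [0.8, 0.2, 0.1],
    "pizza delivery": [0.0, 0.1, 0.9],
}
AD_VEC = [1.0, 0.1, 0.0]  # hypothetical embedding of a travel ad

matches = match_queries_to_ad(AD_VEC, QUERY_VECS)
for query, score in matches:
    print(f"{query}: {score:.3f}")
```

In this toy setup the two travel-related queries pass the threshold and the unrelated one is dropped. At the scale the abstract describes, an exhaustive scan like this would presumably be replaced by an approximate nearest-neighbor index.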
Detection of Active Emergency Vehicles using Per-Frame CNNs and Output Smoothing
While inferring common actor states (such as position or velocity) is an
important and well-explored task of the perception system aboard a self-driving
vehicle (SDV), it may not always provide sufficient information to the SDV.
This is especially true in the case of active emergency vehicles (EVs), where
light-based signals also need to be captured to provide a full context. We
consider this problem and propose a sequential methodology for the detection of
active EVs, using an off-the-shelf CNN model operating at a frame level and a
downstream smoother that accounts for the temporal aspect of flashing EV
lights. We also explore model improvements through data augmentation and
training with additional hard samples